histogram estimator
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (3 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Lower Saxony > Gottingen (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (3 more...)
Model evidence from nonequilibrium simulations
The marginal likelihood, or model evidence, is a key quantity in Bayesian parameter estimation and model comparison. For many probabilistic models, computation of the marginal likelihood is challenging, because it involves a sum or integral over an enormous parameter space. Markov chain Monte Carlo (MCMC) is a powerful approach to compute marginal likelihoods. Various MCMC algorithms and evidence estimators have been proposed in the literature. Here we discuss the use of nonequilibrium techniques for estimating the marginal likelihood. Nonequilibrium estimators build on recent developments in statistical physics and are known as annealed importance sampling (AIS) and reverse AIS in probabilistic machine learning. We introduce estimators for the model evidence that combine forward and backward simulations and show for various challenging models that the evidence estimators outperform forward and reverse AIS.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Lower Saxony > Gottingen (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Private Statistical Estimation of Many Quantiles
Lalanne, Clément, Garivier, Aurélien, Gribonval, Rémi
This work studies the estimation of many statistical quantiles under differential privacy. More precisely, given a distribution and access to i.i.d. samples from it, we study the estimation of the inverse of its cumulative distribution function (the quantile function) at specific points. For instance, this task is of key importance in private data generation. We present two different approaches. The first one consists in privately estimating the empirical quantiles of the samples and using this result as an estimator of the quantiles of the distribution. In particular, we study the statistical properties of the recently published algorithm introduced by Kaplan et al. 2022 that privately estimates the quantiles recursively. The second approach is to use techniques of density estimation in order to uniformly estimate the quantile function on an interval. In particular, we show that there is a tradeoff between the two methods. When we want to estimate many quantiles, it is better to estimate the density rather than estimating the quantile function at specific points.
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > France (0.04)
- (17 more...)
Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation
Vandermeulen, Robert A., Ledent, Antoine
The construction and theoretical analysis of the most popular universally consistent nonparametric density estimators hinge on one functional property: smoothness. In this paper we investigate the theoretical implications of incorporating a multi-view latent variable model, a type of low-rank model, into nonparametric density estimation. To do this we perform extensive analysis on histogram-style estimators that integrate a multi-view model. Our analysis culminates in showing that there exists a universally consistent histogram-style estimator that converges to any multi-view model with a finite number of Lipschitz continuous components at a rate of $\widetilde{O}(1/\sqrt[3]{n})$ in $L^1$ error. In contrast, the standard histogram estimator can converge at a rate slower than $1/\sqrt[d]{n}$ on the same class of densities. We also introduce a new nonparametric latent variable model based on the Tucker decomposition. A rudimentary implementation of our estimators experimentally demonstrates a considerable performance improvement over the standard histogram estimator. We also provide a thorough analysis of the sample complexity of our Tucker decomposition-based model and a variety of other results. Thus, our paper provides solid theoretical foundations for extending low-rank techniques to the nonparametric setting
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (4 more...)
Improving Nonparametric Density Estimation with Tensor Decompositions
While nonparametric density estimators often perform well on low dimensional data, their performance can suffer when applied to higher dimensional data, owing presumably to the curse of dimensionality. One technique for avoiding this is to assume no dependence between features and that the data are sampled from a separable density. This allows one to estimate each marginal distribution independently thereby avoiding the slow rates associated with estimating the full joint density. This is a strategy employed in naive Bayes models and is analogous to estimating a rank-one tensor. In this paper we investigate whether these improvements can be extended to other simplified dependence assumptions which we model via nonnegative tensor decompositions. In our central theoretical results we prove that restricting estimation to low-rank nonnegative PARAFAC or Tucker decompositions removes the dimensionality exponent on bin width rates for multidimensional histograms. These results are validated experimentally with high statistical significance via direct application of an existing nonnegative tensor factorization to histogram estimators.
- North America > United States > New York > New York County > New York City (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Germany > Berlin (0.04)
- (2 more...)
Model evidence from nonequilibrium simulations
The marginal likelihood, or model evidence, is a key quantity in Bayesian parameter estimation and model comparison. For many probabilistic models, computation of the marginal likelihood is challenging, because it involves a sum or integral over an enormous parameter space. Markov chain Monte Carlo (MCMC) is a powerful approach to compute marginal likelihoods. Various MCMC algorithms and evidence estimators have been proposed in the literature. Here we discuss the use of nonequilibrium techniques for estimating the marginal likelihood. Nonequilibrium estimators build on recent developments in statistical physics and are known as annealed importance sampling (AIS) and reverse AIS in probabilistic machine learning. We introduce estimators for the model evidence that combine forward and backward simulations and show for various challenging models that the evidence estimators outperform forward and reverse AIS.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Lower Saxony > Gottingen (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Estimating $\beta$-mixing coefficients
McDonald, Daniel J., Shalizi, Cosma Rohilla, Schervish, Mark
The literature on statistical learning for time series assumes the asymptotic independence or ``mixing' of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing rates from data. We give an estimator for the $\beta$-mixing rate based on a single stationary sample path and show it is $L_1$-risk consistent.
- North America > United States > New York (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Fast Conditional Density Estimation for Quantitative Structure-Activity Relationships
Buchwald, Fabian (Technische Universität München) | Girschick, Tobias (Technische Universität München) | Frank, Eibe (University of Waikato) | Kramer, Stefan (Technische Universität München)
Many methods for quantitative structure-activity relationships (QSARs) deliver point estimates only, without quantifying the uncertainty inherent in the prediction. One way to quantify the uncertainy of a QSAR prediction is to predict the conditional density of the activity given the structure instead of a point estimate. If a conditional density estimate is available, it is easy to derive prediction intervals of activities. In this paper, we experimentally evaluate and compare three methods for conditional density estimation for their suitability in QSAR modeling. In contrast to traditional methods for conditional density estimation, they are based on generic machine learning schemes, more specifically, class probability estimators. Our experiments show that a kernel estimator based on class probability estimates from a random forest classifier is highly competitive with Gaussian process regression, while taking only a fraction of the time for training. Therefore, generic machine-learning based methods for conditional density estimation may be a good and fast option for quantifying uncertainty in QSAR modeling.
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Research Report > New Finding (0.69)
- Research Report > Experimental Study (0.68)